| Ref<br># | Hits | Search Query                                                                                                                                                                | DBs                                                               | Default<br>Operator | Plurals | Time Stamp       |
|----------|------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------|---------------------|---------|------------------|
| L1       | 63   | partial near4 execut\$4 and ((isa)<br>(instruction adj set adj<br>architecture)) and (vliw (very adj<br>long adj instruction adj word))                                     | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR                  | ON      | 2008/01/28 14:52 |
| L2       | 21   | ((non adj operational) (nop) (no adj<br>operation)) and ((isa) (instruction<br>adj set adj architecture)) and<br>(binary adj compatibl\$4) ((binary<br>adj incompatibl\$4)) | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR                  | ON      | 2008/01/28 15:07 |
| L3       | 24   | ((isa) (instruction adj set adj<br>architecture)) and (binary adj<br>compatibl\$4)                                                                                          | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR                  | ON      | 2008/01/28 15:13 |
| L4       | 17   | (pipelin\$4 adj scalar) and ((isa)<br>(instruction adj set adj<br>architecture))                                                                                            | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR                  | ON      | 2008/01/28 15:14 |
| L5       | 3    | (vliw (very adj long adj instruction<br>adj word)) and ((isa) (instruction<br>adj set adj architecture)) and<br>(binary adj compatibl\$4)                                   | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR                  | ON      | 2008/01/28 15:14 |
| L6       | 7    | (vliw (very adj long adj instruction<br>adj word)) and (binary adj<br>compatibl\$4)                                                                                         | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR                  | ON      | 2008/01/28 15:15 |
| L7       | 24   | ((isa) (instruction adj set adj<br>architecture)) and (binary adj<br>compatibl\$4)                                                                                          | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR                  | ON      | 2008/01/28 15:15 |
| L8       | 6    | (vliw (very adj long adj instruction<br>adj word)) and (superscalar<br>pipelin\$4) and (binary adj<br>compatibl\$4)                                                         | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR                  | ON      | 2008/01/28 15:15 |

| Ref<br># | Hits | Search Query | DBs | Default<br>Operator | Plurals | Time Stamp |
|-----|-----|-------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------|------|------|------------------|
| L10 | 104 | (vliw (very adj long adj instruction<br>adj word)) with (convert\$4<br>transform\$4 modif\$4) with<br>(processor multi\$processor)                    | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 12:45 |
| L11 | 20  | (modif\$4 edit\$4 translat\$4 transform\$4) same (run\$time runtime (run adj time)) same ((nop) (no\$operation) (no adj operation))                   | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 15:16 |
| L12 | 79  | (insert\$4) with ( ((nop) (non adj<br>operation))) same (split\$4 divid\$4<br>break\$4 separat\$4)                                                    | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 15:17 |
| L13 | 10  | ((non adj operational) (nop) (no adj<br>operation)) same (insert\$4 enter\$4)<br>same compil\$4 same (split\$5<br>divid\$4 break\$4) with instruction | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 15:16 |
| L14 | 42  | (insert\$4 enter\$4 (adding added add)) with ((non adj operational) (nop) (no adj operation)) same (split\$5 divid\$4 break\$4) with instruction      | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 14:35 |
| L16 | 8   | (modif\$4 edit\$4 translat\$4<br>transform\$4) same (run\$time<br>runtime (run adj time)) same word<br>with length                                    | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 14:41 |
| L18 | 2   | "6298370".pn.                                                                                                                                         | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 13:56 |
| L19 | 2   | "6799266".pn.                                                                                                                                         | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 13:56 |

| Ref<br># | Hits | Search Query | DBs | Default<br>Operator | Plurals | Time Stamp |
|-----|----|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------|------|------|------------------|
| L20 | 28 | partial near4 execut\$4 and ((isa) (instruction adj set adj architecture)) and (vliw (very adj long adj instruction adj word)) and ("717"/ "140" 717/149 717/154 717/155 717/161 "718"\$.ccls. "712".ccls. "717".ccls.)                     | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 14:01 |
| L21 | 14 | ((non adj operational) (nop) (no adj operation)) and ((isa) (instruction adj set adj architecture)) and (binary adj compatibl\$4) ((binary adj incompatibl\$4)) and ("717"/"140" 717/149 717/154 717/155 717/161 "718"\$.ccls. "712".ccls." | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 14:04 |
| L22 | 12 | ((isa) (instruction adj set adj<br>architecture)) and (binary adj<br>compatibl\$4) and ("717"/ "140"<br>717/149 717/154 717/155 717/161<br>"718"\$.ccls. "712".ccls. "717".ccls.)                                                           | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 14:09 |
| L23 | 2  | (pipelin\$4 adj scalar) and ((isa)<br>(instruction adj set adj<br>architecture)) and ("717"/ "140"<br>717/149 717/154 717/155 717/161<br>"718"\$.ccls. "712".ccls. "717".ccls.)                                                             | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 14:21 |
| L24 | 2  | (vliw (very adj long adj instruction adj word)) and ((isa) (instruction adj set adj architecture)) and (binary adj compatibl\$4) and ("717"/ "140" 717/149 717/154 717/155 717/161 "718"\$.ccls. "712".ccls. "717".ccls.)                   | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 14:21 |
| L25 |    | (vliw (very adj long adj instruction<br>adj word)) and (binary adj<br>compatibl\$4) and ("717"/ "140"<br>717/149 717/154 717/155 717/161<br>"718"\$.ccls. "712".ccls. "717".ccls.)                                                          | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 14:21 |
| L26 |    | (modif\$4 edit\$4 translat\$4 transform\$4) same (run\$time runtime (run adj time)) same ((nop) (no\$operation) (no adj operation)) and ("717"/ "140" 717/149 717/154 717/155 717/161 "718"\$.ccls. "712".ccls. "717".ccls.)                | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON   | 2008/01/28 14:22 |

| Ref<br># | Hits | Search Query | DBs | Default<br>Operator | Plurals | Time Stamp |
|-----|-----|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------|------|----|------------------|
| L27 | 23  | (insert\$4) with ( ((nop) (non adj<br>operation))) same (split\$4 divid\$4 break\$4 separat\$4) and ("717"/ "140" 717/149 717/154 717/155 717/161 "718"\$.ccls. "712".ccls. | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON | 2008/01/28 14:25 |
| L28 | 2   | ((non adj operational) (nop) (no adj operation)) same (insert\$4 enter\$4) same compil\$4 same (split\$5 divid\$4 break\$4) with instruction and ("717"/ "140" 717/149 717/154 717/155 717/161 "718"\$.ccls. "712".ccls. "717".ccls.)     | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON | 2008/01/28 14:29 |
| L29 | 17  | (insert\$4 enter\$4 (adding added add)) with ((non adj operational) (nop) (no adj operation)) same (split\$5 divid\$4 break\$4) with instruction and ("717"/ "140" 717/149 717/154 717/155 717/161 "718"\$.ccls. "712".ccls. "717".ccls.) | US-PGPUB;<br>USPAT;<br>USOCR;<br>EPO; JPO;<br>DERWENT;<br>IBM_TDB | OR   | ON | 2008/01/28 14:35 |
| L31 | 7   | partial near4 execut\$4 and ((isa)<br>(instruction adj set adj<br>architecture)) and (vliw (very adj<br>long adj instruction adj word)).clm.                                                                                              | US-PGPUB                                                          | OR   | ON | 2008/01/28 15:04 |
| L33 | 5   | ((non adj operational) (nop) (no adj operation)) and ((isa) (instruction adj set adj architecture)) and (binary adj compatibl\$4) ((binary adj incompatibl\$4)).clm.                                                                      | US-PGPUB                                                          | OR   | ON | 2008/01/28 15:07 |
| L34 | 1   | ((isa) (instruction adj set adj<br>architecture)) and (binary adj<br>compatibl\$4).clm.                                                                                                                                                   | US-PGPUB                                                          | OR   | ON | 2008/01/28 15:14 |
| L35 | 2   | (pipelin\$4 adj scalar) and ((isa)<br>(instruction adj set adj<br>architecture)).clm.                                                                                                                                                     | US-PGPUB                                                          | OR   | ON | 2008/01/28 15:14 |
| L38 | 1   | ((isa) (instruction adj set adj<br>architecture)) and (binary adj<br>compatibl\$4).clm.                                                                                                                                                   | US-PGPUB                                                          | OR   | ON | 2008/01/28 15:15 |
| L40 | 1   | ((non adj operational) (nop) (no adj<br>operation)) same (insert\$4 enter\$4)<br>same compil\$4 same (split\$5<br>divid\$4 break\$4) with instruction.<br>clm.                                                                            | US-PGPUB                                                          | OR   | ON | 2008/01/28 15:16 |
| L42 | 2   | (insert\$4) with ( ((nop) (non adj<br>operation))) same (split\$4 divid\$4<br>break\$4 separat\$4).clm.                                                                                                                                   | US-PGPUB                                                          | OR   | ON | 2008/01/28 15:18 |



### THE ACM DIGITAL LIBRARY

+vliw +multiprocessor

Terms used: vliw multiprocessor

Found 631 of 238,273

Results 1 - 20 of 631

1 Fast and Accurate Multiprocessor Architecture Exploration with Symbolic Programs

Vladimir D. Zivkovic, Erwin de Kock, Pieter van der Wolf, Ed Deprettere March 2003 DATE '03: Proceedings of the conference on Design, Automation and Test in Europe - Volume 1, Volume 1

Publisher: IEEE Computer Society

Full text available: pdf(190.64 KB) Publisher Site Additional Information: full citation, abstract, cited by, index terms

In system-level platform-based embedded systems design, the mapping model is a crucial link between the application model and the architecture model. All three models must match when design-space exploration has to be fast and accurate, and when exploration ...

2 Increasing on-chip memory space utilization for embedded chip multiprocessors through data compression

Ozcan Ozturk, Mahmut Kandemir, Mary Jane Irwin

September 2005 CODES+ISSS '05: Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis

Publisher: ACM

Full text available: pdf(229.40 KB) Additional Information: full citation, abstract, references, index terms

Minimizing the number of off-chip memory references is very important in chip multiprocessors from both the performance and power perspectives. To achieve this the distance between successive reuses of the same data block must be reduced. However, this ...

Keywords: chip multiprocessors, data compression, optimizing compiler

3 VLIW: a case study of parallelism verification

Allon Adir, Yaron Arbetman, Bella Dubrov, Yossi Lichtenstein, Michal Rimon, Michael Vinov, Massimo A. Calligaro, Andrew Cofler, Gabriel Duffy June 2005 **DAC '05:** Proceedings of the 42nd annual conference on Design automation

Publisher: ACM

Full text available: pdf(355.75 KB) Additional Information: full citation, abstract, references, index terms

Parallelism in processor architecture and design imposes a verification challenge as the exponential growth in the number of execution combinations becomes ...

**Keywords**: VLIW, functional verification, parallelism, processor verification, test generation

4 Architecture and programming of a VLIW style programmable video signal processor

G. Essink, E. Aarts, R. van Dongen, P. van Gerwen, J. Korst, K. Vissers September 1991 **MICRO 24:** Proceedings of the 24th annual international symposium on Microarchitecture

Publisher: ACM

Full text available: pdf(876.55 KB) Additional Information: full citation, references, cited by, index terms

5 The design of a RISC based multiprocessor chip

Rajiv Gupta, Michael Epstein, Michael Whelan

November 1990 **Supercomputing '90:** Proceedings of the 1990 ACM/IEEE conference on Supercomputing

Publisher: IEEE Computer Society

Full text available: pdf(1.10 MB) Additional Information: full citation, abstract, references

This paper describes the architecture of a RISC based multiprocessor chip. The processors operate in a MIMD fashion executing parallel instruction streams generated by a parallelizing compiler for the exploitation of fine-grained parallelism. Low cost ...

**Keywords**: collective branching, fuzzy barrier, parallelizing compiler, register channels, very long instruction word (VLIW) architectures

6 An integer linear programming based approach for parallelizing applications in On-chip multiprocessors

I. Kadayif, M. Kandemir, U. Sezer June 2002 **DAC '02:** Proceedings of the 39th conference on Design automation

Publisher: ACM

Full text available: pdf(174.70 KB) Additional Information: full citation, abstract, references, cited by, index terms

With energy consumption becoming one of the first-class optimization parameters in computer system design, compilation techniques that consider performance and energy simultaneously are expected to play a central role. In particular, compiling a given ...

**Keywords**: constraint-based compilation, embedded systems, loop-Level parallelism

7 Seamless - a latency-tolerant RISC-based multiprocessor architecture (abstract)

Samuel A. Fineberg, Thomas L. Casavant, Brent H. Pease May 1992 ISCA '92: ACM SIGARCH Computer Architecture News, Volume 20 Issue 2

Publisher: ACM

Full text available: pdf(57.30 KB) Additional Information: full citation, abstract, index terms

The Seamless parallel system being developed at the University of Iowa ECE Department provides a method for providing latency tolerance in physically-distributed memory systems utilizing "off-the-shelf" RISC CPU's without incurring the overhead ...

8 Design space exploration using arithmetic-level hardware--software cosimulation for configurable multiprocessor platforms

Jingzhao Ou, Viktor K. Prasanna

May 2006 ACM Transactions on Embedded Computing Systems (TECS), Volume 5 Issue 2

Publisher: ACM

Full text available: pdf(814.20 KB) Additional Information: full citation, abstract, references, index terms

Configurable multiprocessor platforms consist of multiple soft processors configured on FPGA devices. They have become an attractive choice for implementing many computing applications. In addition to the various ways of distributing software execution ...

Keywords: FPGA, cosimulation, design space exploration, processor

9 Transient-fault recovery for chip multiprocessors

Mohamed Gomaa, Chad Scarbrough, T. N. Vijaykumar, Irith Pomeranz June 2003 **ISCA '03:** Proceedings of the 30th annual international symposium on Computer architecture

Publisher: ACM

Full text available: pdf(370.75 KB) Additional Information: full citation, abstract, references, cited by

To address the increasing susceptibility of commodity chip multiprocessors (CMPs) to transient faults, we propose Chip-level Redundantly Threaded multiprocessor with Recovery (CRTR). CRTR extends the previously-proposed CRT for transient-fault detection ...

10 Design and programming of embedded multiprocessors: an interface-centric approach

Pieter van der Wolf, Erwin de Kock, Tomas Henriksson, Wido Kruijtzer, Gerben Essink

September 2004 **CODES+ISSS '04:** Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis

Publisher: ACM

Full text available: pdf(377.96 KB) Additional Information: full citation, abstract, references, cited by, index terms

We present design technology for the structured design and programming of embedded multi-processor systems. It comprises a task-level interface that can be used both for developing parallel application models and as a platform interface for implementing ...

**Keywords:** code transformation, media processing, multiprocessor mapping, platform interface, system design method, task-level interface

### 11 A resource-shared VLIW processor architecture for area-efficient on-chip multiprocessing

Kazutoshi Kobayashi, Masao Aramoto, Yoichi Yuyama, Akihiko Higuchi, Hidetoshi Onodera

January 2005 **ASP-DAC '05:** Proceedings of the 2005 conference on Asia South Pacific design automation

Publisher: ACM

Full text available: pdf(412.71 KB) Additional Information: full citation, abstract, references

We propose an area-efficient resource-shared VLIW processor (RSVP) for future leaky nm process technologies. It consists of several single-way independent processor units (IPUs) that share parallel processor resources. Each IPU works as a variable-way ...

### 12 A run-time word-level reconfigurable coarse-grain functional unit for a VLIW processor

Natalino G. Busá, Carles Rodoreda Sala

October 2002 **ISSS '02:** Proceedings of the 15th international symposium on System Synthesis

Publisher: ACM

Full text available: pdf(492.13 KB) Additional Information: full citation, abstract, references, index terms

Nowadays, new DSP applications are offering combined and flexible multimedia and telecom services. VLIW processor architectures, which include dedicated but inflexible functional units, are usually tuned to a single specific application. In order to ...

**Keywords**: VLIW processors, architectural synthesis, reconfigurable logic

### 13 The future of multiprocessor systems-on-chips

Wayne Wolf

June 2004 **DAC '04:** Proceedings of the 41st annual conference on Design automation

Publisher: ACM

Full text available: pdf(58.72 KB) Additional Information: full citation, abstract, references, cited by, index terms

This paper surveys the state-of-the-art and pending challenges in MPSoC design. Standards in communications, multimedia, networking, and other areas encourage the development of high-performance platforms that can support a range of implementations of ...

**Keywords**: MPSoC, embedded software, low power, real-time, system-on-chip

14 Transient-fault recovery for chip multiprocessors

Mohamed Gomaa, Chad Scarbrough, T. N. Vijaykumar, Irith Pomeranz May 2003 ISCA '03: ACM SIGARCH Computer Architecture News, Volume 31 Issue 2

Publisher: ACM

Full text available: pdf(370.75 KB) Additional Information: full citation, abstract, references, cited by

To address the increasing susceptibility of commodity chip multiprocessors (CMPs) to transient faults, we propose Chip-level Redundantly Threaded multiprocessor with Recovery (CRTR). CRTR extends the previously-proposed CRT for transient-fault detection ...

### 15 Multiprocessor mapping of process networks: a JPEG decoding case study

E. A. de Kock

October 2002 ISSS '02: Proceedings of the 15th international symposium on System Synthesis

Publisher: ACM

Full text available: pdf(333.53 KB) Additional Information: full citation, abstract, references, cited by, index terms

We present a system-level design and programming method for embedded multiprocessor systems. The aim of the method is to improve the design time and design quality by providing a structured approach for implementing process networks. We use process networks ...

**Keywords**: code transformation, data parallelism, multiprocessor mapping, process network, system design method, task-level parallelism

16 A VLIW architecture for a trace scheduling compiler

Robert P. Colwell, Robert P. Nix, John J. O'Donnell, David B. Papworth, Paul

November 1987 ASPLOS-II: ACM SIGARCH Computer Architecture News, Volume 15 Issue 5

Publisher: ACM

Full text available: pdf(1.59 MB) Additional Information: full citation, abstract, references, cited by, index terms

Very Long Instruction Word (VLIW) architectures were promised to deliver far more than the factor of two or three that current architectures achieve from overlapped execution. Using a new type of compiler which compacts ordinary sequential code into ...

### 17 Contribution of Compilation Techniques to the Synthesis of Dedicated VLIW Architectures

G. Menez, Michel Auguin, Fernand Boeri, C. Carrière

January 1993 PACT '93: Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism

Publisher: North-Holland Publishing Co.

Additional Information: full citation, references, cited by

### 18 Heterogeneous multiprocessor implementations for JPEG:: a case study

Seng Lin Shee, Andrea Erdos, Sri Parameswaran October 2006 CODES+ISSS '06: Proceedings of the 4th international conference on Hardware/software codesign and system synthesis

Publisher: ACM

Full text available: pdf(225.70 KB) Additional Information: full citation, abstract, references, index terms

Heterogeneous multiprocessor SoCs are becoming a reality, largely due to the abundance of transistors, intellectual property cores and powerful design tools. In this project, we explore the use of multiple cores to speed up the JPEG compression algorithm. ...

### 19 Instruction architecture of an aerospace multiprocessor

James S. Miller, Woodrow H. Vandever, Jr.

November 1973 ACM SIGPLAN Notices, Volume 8 Issue 11

Publisher: ACM

Full text available: pdf(598.38 KB) Additional Information: full citation, abstract, references, index terms

This paper describes the architecture, the data forms, and the instruction set of a multiprocessor computer designed to provide the central computational facilities of a long-lifetime orbiting space laboratory. An overview of the system [M73] describes ...

### 20 Instruction architecture of an aerospace multiprocessor

James S. Miller, Woodrow H. Vandever, Jr.

November 1973 Proceedings of the ACM-IEEE symposium on High-level-language computer architecture

Publisher: ACM

Full text available: pdf(598.38 KB) Additional Information: full citation, abstract, references, index terms

This paper describes the architecture, the data forms, and the instruction set of a multiprocessor computer designed to provide the central computational facilities of a long-lifetime orbiting space laboratory. An overview of the system [M73] describes ...

The ACM Portal is published by the Association for Computing Machinery. Copyright © 2008 ACM, Inc.

### THE ACM DIGITAL LIBRARY

+vliw +nop

Terms used: vliw nop

Found 128 of 238,273

Results 1 - 20 of 128

1 Compiler-directed thermal management for VLIW functional units

Madhu Mutyam, Feihui Li, Vijaykrishnan Narayanan, Mahmut Kandemir, Mary Jane Irwin

June 2006 LCTES '06: Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems

Publisher: ACM

Full text available: pdf(599.24 KB) Additional Information: full citation, abstract, references, index terms

As processors, memories, and other components of today's embedded systems are pushed to higher performance in more enclosed spaces, processor thermal management is quickly becoming a limiting design factor. While previous proposals mostly approached ...

Keywords: IPC, VLIW, thermal

2 VLIW: a case study of parallelism verification

Allon Adir, Yaron Arbetman, Bella Dubrov, Yossi Lichtenstein, Michal Rimon, Michael Vinov, Massimo A. Calligaro, Andrew Cofler, Gabriel Duffy June 2005 DAC '05: Proceedings of the 42nd annual conference on Design automation

Publisher: ACM

Full text available: pdf(355.75 KB) Additional Information: full citation, abstract, references, index terms

Parallelism in processor architecture and design imposes a verification challenge as the exponential growth in the number of execution combinations becomes ...

Keywords: VLIW, functional verification, parallelism, processor verification, test generation

3 Automatic formal verification for scheduled VLIW code

Xiushan Feng, Alan J. Hu

June 2002 LCTES/SCOPES '02: Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems

Publisher: ACM

Full text available: pdf(113.92 KB) Additional Information: full citation, abstract, references, cited by, index terms

VLIW processors are attractive for many embedded applications, but VLIW code scheduling, whether by hand or by compiler, is extremely challenging. In this paper, we extend previous work on automated verification of low-level software to handle the complexity ...

Keywords: DSP, VLIW, formal verification, symbolic execution, theory of equality with uninterpreted functions

4 Branch prediction techniques for low-power VLIW processors

G. Palermo, M. Sami, C. Silvano, V. Zaccaria, R. Zafalon

April 2003 GLSVLSI '03: Proceedings of the 13th ACM Great Lakes symposium on VLSI

Publisher: ACM

Full text available: pdf(178.89 KB) Additional Information: full citation, abstract, references, cited by, index terms

Main goal of the paper is to introduce a branch prediction scheme suitable for energy-efficient VLIW (Very Long Instruction Word) processors aiming at reducing the energy associated with the prediction phase by filtering the accesses to the branch predictor ...

**Keywords**: VLIW processors, branch prediction, low-power design

5 Loop fusion for clustered VLIW architectures

Yi Qian, Steve Carr, Philip Sweany

July 2002 LCTES/SCOPES '02: ACM SIGPLAN Notices, Volume 37 Issue 7

Publisher: ACM

Full text available: pdf(111.58 KB) Additional Information: full citation, abstract, references, cited by, index terms

Embedded systems require maximum performance from a processor within significant constraints in power consumption and chip cost. Using software pipelining, high-performance digital signal processors can often exploit considerable instruction-level parallelism ...

**Keywords**: clustered VLIW architectures, loop fusion

6 A resource-shared VLIW processor architecture for area-efficient on-chip multiprocessing

Kazutoshi Kobayashi, Masao Aramoto, Yoichi Yuyama, Akihiko Higuchi, Hidetoshi Onodera

January 2005 ASP-DAC '05: Proceedings of the 2005 conference on Asia South Pacific design automation

Publisher: ACM

Full text available: pdf(412.71 KB) Additional Information: full citation, abstract, references

We propose an area-efficient resource-shared VLIW processor (RSVP) for future leaky nm process technologies. It consists of several single-way independent processor units (IPUs) that share parallel processor resources. Each IPU works as a variable-way ...

### 7 Software Pipelining Irregular Loops On the TMS320C6000 VLIW DSP Architecture

Elana Granston, Eric Stotzer, Joe Zbiciak

August 2001 OM '01: ACM SIGPLAN Notices, Volume 36 Issue 8

Publisher: ACM

Full text available: pdf(88.54 KB) Additional Information: full citation, abstract, references, index terms

The TMS320C6000 architecture is a leading family of Digital Signal Processors (DSPs). To achieve peak performance, this VLIW architecture relies heavily on software pipelining. Traditionally, software pipelining has been restricted to *regular* ...

**Keywords**: VLIW architectures, WHILE loops, digital signal processors, software pipelining

### 8 VLIW-DLX simulator for educational purposes

Miloš Bečvář, Stanislav Kahánek

June 2007 **WCAE '07:** Proceedings of the 2007 workshop on Computer architecture education

Publisher: ACM

Full text available: pdf(381.06 KB) Additional Information: full citation, abstract, references, index terms

VLIW-DLX is graphical simulator of simple VLIW processor. It is targeted to be used in undergraduate computer architecture courses. VLIW-DLX uses similar GUI to well-known WinDLX simulator and its ISA uses scalar DLX instructions as the building blocks. ...

**Keywords**: VLIW, computer architecture, education, simulation

### 9 Reducing code size for heterogeneous-connectivity-based VLIW DSPs through synthesis of instruction set extensions

Partha Biswas, Nikil Dutt

October 2003 **CASES '03:** Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems

Publisher: ACM

Full text available: pdf(176.82 KB) Additional Information: full citation, abstract, references, index terms

VLIW DSP architectures exhibit heterogeneous connections between functional units and register files for speeding up special tasks. Such architectural characteristics can be effectively exploited through the use of complex instruction set extensions ...

**Keywords**: dependence conflict graph, heterogeneous-connectivity-based DSP, instruction set architecture, instruction set extensions, restricted data dependence graph, static single assignment

### 10 Software Pipelining Irregular Loops On the TMS320C6000 VLIW DSP Architecture

Elana Granston, Eric Stotzer, Joe Zbiciak

August 2001 LCTES '01: Proceedings of the ACM SIGPLAN workshop on Languages, compilers and tools for embedded systems


Publisher: ACM

Full text available: pdf(88.54 KB) Additional Information: full citation, abstract, references, index terms

The TMS320C6000 architecture is a leading family of Digital Signal Processors (DSPs). To achieve peak performance, this VLIW architecture relies heavily on software pipelining. Traditionally, software pipelining has been restricted to regular ...

**Keywords**: VLIW architectures, WHILE loops, digital signal processors, software pipelining

### 11 Loop fusion for clustered VLIW architectures


Yi Qian, Steve Carr, Philip Sweany

June 2002 **LCTES/SCOPES '02:** Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems

Publisher: ACM

Full text available: pdf(111.58 KB) Additional Information: full citation, abstract, references, cited by, index terms

Embedded systems require maximum performance from a processor within significant constraints in power consumption and chip cost. Using software pipelining, high-performance digital signal processors can often exploit considerable instruction-level parallelism ...

Keywords: clustered VLIW architectures, loop fusion

12 Optimal integrated code generation for clustered VLIW architectures


Christoph Kessler, Andrzej Bednarski

June 2002 LCTES/SCOPES '02: Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems

Publisher: ACM

Full text available: pdf(227.07 KB) Additional Information: full citation, abstract, references, cited by, index terms

In contrast to standard compilers, generating code for DSPs can afford spending considerable resources in time and space on optimizations. Generating efficient code for irregular architectures requires an integrated method that optimizes simultaneously ...

**Keywords**: dynamic programming, instruction scheduling, instruction selection, integrated code generation, register allocation, space profile

13 Methodology for operation shuffling and L0 cluster generation for low energy heterogeneous VLIW processors

Yuki Kobayashi, Murali Jayapala, Praveen Raghavan, Francky Catthoor, Masaharu Imai

September 2007 ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 12 Issue 4

Publisher: ACM

Full text available: pdf(1.56 MB) Additional Information: full citation, abstract, references, index terms

Clustering L0 buffers is effective for energy reduction in the instruction memory hierarchy of embedded VLIW processors. However, the efficiency of the clustering depends on the schedule of the target application. Especially in heterogeneous or data ...

Keywords: Compilers for low energy, VLIW processors, loop buffers

14 Instantaneous current modeling in a complex VLIW processor core

Radu Muresan, Catherine Gebotys

May 2005 ACM Transactions on Embedded Computing Systems (TECS), Volume 4 Issue 2

Publisher: ACM

Full text available: pdf(3.64 MB) Additional Information: full citation, abstract, references, cited by, index terms

Measuring and modeling instantaneous current consumption or current dynamics of a processor is important in embedded system designs, wireless communications, low-energy mobile computing, security of communications, and reliability. In this paper, we ...

**Keywords**: Instruction-level current model, current and power measurement in a processor, instantaneous current model, power and energy model

15 Distributed loop controller architecture for multi-threading in uni-threaded VLIW processors

Praveen Raghavan, Andy Lambrechts, Murali Jayapala, Francky Catthoor, Diederik Verkest

March 2006 **DATE '06:** Proceedings of the conference on Design, automation and test in Europe: Proceedings

Publisher: European Design and Automation Association

Full text available: pdf(241.43 KB) Additional Information: full citation, abstract, references

Reduced energy consumption is one of the most important design goals for embedded application domains like wireless, multimedia and biomedical. Instruction memory hierarchy has been proven to be one of the most power hungry parts of the system. This ...

16 Tailoring pipeline bypassing and functional unit mapping to application in clustered VLIW architectures

Marcio Buss, Rodolfo Azevedo, Paulo Centoducatte, Guido Araujo

November 2001 **CASES '01:** Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems

Publisher: ACM

Full text available: pdf(213.62 KB) Additional Information: full citation, abstract, references, cited by, index terms

In this paper we describe a design exploration methodology for clustered VLIW architectures. The central idea of this work is a set of three techniques aimed at reducing the cost of expensive inter-cluster copy operations. Instruction scheduling is performed ...

17 Loop scheduling with timing and switching-activity minimization for VLIW DSP

Zili Shao, Bin Xiao, Chun Xue, Qingfeng Zhuge, Edwin H.-M. Sha

January 2006 ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 11 Issue 1

Publisher: ACM

Full text available: pdf(740.56 KB) Additional Information: full citation, abstract, references, index terms

In embedded systems, high-performance DSP needs to be performed not only with high-data throughput but also with low-power consumption. This article develops an instruction-level loop-scheduling technique to reduce both execution time and bus-switching ...

**Keywords**: VLIW, compilers, instruction bus optimization, instruction scheduling, loops, low-power optimization, retiming, software pipelining

### 18 Automatic formal verification for scheduled VLIW code

Xiushan Feng, Alan J. Hu

July 2002 LCTES/SCOPES '02: ACM SIGPLAN Notices, Volume 37 Issue 7

Publisher: ACM

Full text available: pdf(113.92 KB) Additional Information: full citation, abstract, references, cited by, index terms

VLIW processors are attractive for many embedded applications, but VLIW code scheduling, whether by hand or by compiler, is extremely challenging. In this paper, we extend previous work on automated verification of low-level software to handle the complexity ...

Keywords: DSP, VLIW, formal verification, symbolic execution, theory of equality with uninterpreted functions

### 19 Software Pipelining Irregular Loops On the TMS320C6000 VLIW

DSP Architecture

Elana Granston, Eric Stotzer, Joe Zbiciak

August 2001 OM '01: Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems

Publisher: ACM

Full text available: pdf(88.54 KB) Additional Information: full citation, abstract, references, index terms

The TMS320C6000 architecture is a leading family of Digital Signal Processors (DSPs). To achieve peak performance, this VLIW architecture relies heavily on software pipelining. Traditionally, software pipelining has been restricted to regular ...

Keywords: VLIW architectures, WHILE loops, digital signal processors, software pipelining

### 20 Optimal integrated code generation for clustered VLIW architectures

Christoph Kessler, Andrzej Bednarski

July 2002 LCTES/SCOPES '02: ACM SIGPLAN Notices, Volume 37 Issue 7

Publisher: ACM

Full text available: pdf(227.07 KB) Additional Information: full citation, abstract, references, cited by, index terms

In contrast to standard compilers, generating code for DSPs can afford spending considerable resources in time and space on optimizations. Generating efficient code for irregular architectures requires an integrated method that optimizes simultaneously ...

**Keywords**: dynamic programming, instruction scheduling, instruction selection, integrated code generation, register allocation, space profile

Results 1 - 20 of 128

Result page: 1 2 3 4 5 6 7 next >>

The ACM Portal is published by the Association for Computing Machinery. Copyright © 2008 ACM, Inc.


### THE ACM DIGITAL LIBRARY


+vliw +modification +parallel
Terms used: vliw modification parallel

Found **305** of **238,273** 


Results 1 - 20 of 305

Result page: 1 2 3 4 5 6 7 8 9 10 next >>

1 Flexible Compiler-Managed L0 Buffers for Clustered VLIW Processors

Enric Gibert, Jesús Sánchez, Antonio González

December 2003 MICRO 36: Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture

Publisher: IEEE Computer Society

Full text available: pdf(167.74 KB) Additional Information: full citation, abstract, cited by, index terms

Wire delays are a major concern for current and forthcoming processors. One approach to attack this problem is to divide the processor into semi-independent units referred to as clusters. A cluster usually consists of a local register file and a subset ...

2 Implicitly parallel programming models for thousand-core microprocessors

Wen-mei Hwu, Shane Ryoo, Sain-Zee Ueng, John H. Kelm, Isaac Gelado, Sam S. Stone, Robert E. Kidd, Sara S. Baghsorkhi, Aqeel A. Mahesri, Stephanie C. Tsao, Nacho Navarro, Steve S. Lumetta, Matthew I. Frank, Sanjay J. Patel

June 2007 **DAC '07:** Proceedings of the 44th annual conference on Design automation

Publisher: ACM

Full text available: pdf(600.21 KB) Additional Information: full citation, abstract, references, index terms

This paper argues for an implicitly parallel programming model for many-core microprocessors, and provides initial technical approaches towards this goal. In an implicitly parallel programming model, programmers maximize algorithm-level parallelism, ...

Keywords: parallel programming

3 Efficient event-driven simulation of parallel processor architectures

Alexey Kupriyanov, Dmitrij Kissler, Frank Hannig, Jürgen Teich

April 2007 **SCOPES '07:** Proceedings of the 10th international workshop on Software & compilers for embedded systems

Publisher: ACM

Full text available: pdf(619.62 KB) Additional Information: full citation, abstract, references

In this paper we present a new approach for generating high-speed optimized event-driven instruction set level simulators for adaptive massively parallel processor architectures. The simulator generator is part of a methodology for the systematic mapping, ...

**Keywords**: embedded tools, modeling, processor arrays, simulation

4 Automatic formal verification for scheduled VLIW code

Xiushan Feng, Alan J. Hu

June 2002 **LCTES/SCOPES '02:** Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems

Publisher: ACM

Full text available: pdf(113.92 KB) Additional Information: full citation, abstract, references, cited by, index terms

VLIW processors are attractive for many embedded applications, but VLIW code scheduling, whether by hand or by compiler, is extremely challenging. In this paper, we extend previous work on automated verification of low-level software to handle the complexity ...

**Keywords**: DSP, VLIW, formal verification, symbolic execution, theory of equality with uninterpreted functions

5 A fast parallel reed-solomon decoder on a reconfigurable architecture

Arezou Koohi, Nader Bagherzadeh, Chengzi Pan

October 2003 **CODES+ISSS '03:** Proceedings of the 1st IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis

Publisher: ACM

Full text available: pdf(292.18 KB) Additional Information: full citation, abstract, references, cited by, index terms

This paper presents a software implementation of a very fast parallel Reed-Solomon decoder on the second generation of MorphoSys reconfigurable computation platform, which is targeting on streamed applications such as multimedia and DSP. Numerous modifications ...

**Keywords**: Berlekamp algorithm, Chein search, Reed-Solomon codes, SIMD processor, reconfigurable architecture

6 Interactive presentation: Time-constrained clustering for DSE of clustered VLIW-ASP

Mario Schölzel

April 2007 **DATE '07:** Proceedings of the conference on Design, automation and test in Europe

Publisher: EDA Consortium

Full text available: pdf(277.07 KB) Additional Information: full citation, abstract, references

In this paper we describe a new time-constrained clustering algorithm. It is coupled with a time-constrained scheduling algorithm and used for Design-Space-Exploration (DSE) of clustered VLIW processors with heterogeneous clusters and heterogeneous functional ...

7 A scalable wide-issue clustered VLIW with a reconfigurable interconnect

Osvaldo Colavin, Davide Rizzo

October 2003 CASES '03: Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems

Publisher: ACM

Full text available: pdf(365.26 KB) Additional Information: full citation, abstract, references, index terms

Clustered VLIW architectures have been widely adopted in modern embedded multimedia applications for their ability to exploit high degrees of ILP with reasonable trade-off in complexity and silicon costs. Studies have however shown limited performance ...

Keywords: IDCT, clustered VLIW, modulo scheduling, reconfigurable coprocessor (RCP)

### 8 A run-time word-level reconfigurable coarse-grain functional unit for a VLIW processor

Natalino G. Busá, Carles Rodoreda Sala

October 2002 ISSS '02: Proceedings of the 15th international symposium on System Synthesis

Publisher: ACM

Full text available: pdf(492.13 KB) Additional Information: full citation, abstract, references,

Nowadays, new DSP applications are offering combined and flexible multimedia and telecom services. VLIW processor architectures, which include dedicated but inflexible functional units, are usually tuned to a single specific application. In order to ...

Keywords: VLIW processors, architectural synthesis, reconfigurable logic

### 9 A portable interface for on-the-fly instruction space modification

David Keppel

April 1991 ASPLOS-IV: Proceedings of the fourth international conference on Architectural support for programming languages and operating systems

Publisher: ACM

Full text available: pdf(1.01 MB) Additional Information: full citation, references, cited by, index terms

### 10 A portable interface for on-the-fly instruction space modification

David Keppel
April 1991 ASPLOS-IV: ACM SIGPLAN Notices, Volume 26 Issue 4

Publisher: ACM

Full text available: pdf(1.01 MB) Additional Information: full citation, references, cited by,

### 11 Software Pipelining Irregular Loops On the TMS320C6000 VLIW DSP Architecture

Elana Granston, Eric Stotzer, Joe Zbiciak

August 2001 OM '01: ACM SIGPLAN Notices, Volume 36 Issue 8

Publisher: ACM

Full text available: pdf(88.54 KB) Additional Information: full citation, abstract, references,

The TMS320C6000 architecture is a leading family of Digital Signal Processors (DSPs). To achieve peak performance, this VLIW architecture relies heavily on software pipelining. Traditionally, software pipelining has been restricted to *regular* ...

**Keywords**: VLIW architectures, WHILE loops, digital signal processors, software pipelining

### 12 VLIW-DLX simulator for educational purposes

Miloš Bečvář, Stanislav Kahánek

June 2007 WCAE '07: Proceedings of the 2007 workshop on Computer architecture education

Publisher: ACM

Full text available: pdf(381.06 KB) Additional Information: full citation, abstract, references, index terms

VLIW-DLX is graphical simulator of simple VLIW processor. It is targeted to be used in undergraduate computer architecture courses. VLIW-DLX uses similar GUI to well-known WinDLX simulator and its ISA uses scalar DLX instructions as the building blocks. ...

**Keywords**: VLIW, computer architecture, education, simulation

### 13 Software Pipelining Irregular Loops On the TMS320C6000 VLIW DSP Architecture

Elana Granston, Eric Stotzer, Joe Zbiciak

August 2001 LCTES '01: Proceedings of the ACM SIGPLAN workshop on Languages, compilers and tools for embedded systems

Publisher: ACM

Full text available: pdf(88.54 KB) Additional Information: full citation, abstract, references, index terms

The TMS320C6000 architecture is a leading family of Digital Signal Processors (DSPs). To achieve peak performance, this VLIW architecture relies heavily on software pipelining. Traditionally, software pipelining has been restricted to *regular* ...

**Keywords**: VLIW architectures, WHILE loops, digital signal processors, software pipelining

### 14 Compiling to a VLIW fragment pipeline

William R. Mark, Kekoa Proudfoot

August 2001 **HWWS '01:** Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on Graphics hardware

Publisher: ACM

Full text available: pdf(144.85 KB) Additional Information: full citation, abstract, references, cited by, index terms

The latest generation of graphics hardware supports fully programmable vertex and pixel/fragment operations, but programming this hardware at a low level is difficult and time consuming. To address this problem, we have developed a complete real-time ...

### 15 Cluster assignment for high-performance embedded VLIW processors

Viktor S. Lapinskii, Margarida F. Jacome, Gustavo A. De Veciana

July 2002 ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 7 Issue 3

Publisher: ACM

Full text available: pdf(226.89 KB) Additional Information: full citation, abstract, references, cited by, index terms

Clustering is an effective method to increase the available parallelism in VLIW datapaths without incurring severe penalties associated with a large number of register file ports. Efficient utilization of a clustered datapath requires careful binding/assignment ...

**Keywords**: Operation binding, clustered VLIW datapaths, embedded processors, embedded systems, partitioning

### 16 Tailoring pipeline bypassing and functional unit mapping to application in clustered VLIW architectures

Marcio Buss, Rodolfo Azevedo, Paulo Centoducatte, Guido Araujo

November 2001 **CASES '01:** Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems

Publisher: ACM

Full text available: pdf(213.62 KB) Additional Information: full citation, abstract, references, cited by, index terms

In this paper we describe a design exploration methodology for clustered VLIW architectures. The central idea of this work is a set of three techniques aimed at reducing the cost of expensive inter-cluster copy operations. Instruction scheduling is performed ...

### 17 Synthesizable HDL generation method for configurable VLIW processors

Yuki Kobayashi, Shinsuke Kobayashi, Koji Okuda, Keishi Sakanushi, Yoshinori Takeuchi, Masaharu Imai

January 2004 **ASP-DAC '04:** Proceedings of the 2004 conference on Asia South Pacific design automation: electronic design and solution fair

Publisher: IEEE Press

Full text available: Publisher Site Additional Information: full citation, abstract, references

This paper proposes a synthesizable HDL code generation method using a processor specification description. The proposed approach can change the number of slots and pipeline stages, and dispatching rule to assign operations to resources. In addition, ...

### 18 Automatic formal verification for scheduled VLIW code

Xiushan Feng, Alan J. Hu

July 2002 LCTES/SCOPES '02: ACM SIGPLAN Notices, Volume 37 Issue 7

Publisher: ACM

Full text available: pdf(113.92 KB) Additional Information: full citation, abstract, references, cited by, index terms

VLIW processors are attractive for many embedded applications, but VLIW code scheduling, whether by hand or by compiler, is extremely challenging. In this paper, we extend previous work on automated verification of low-level software to handle the complexity ...

**Keywords**: DSP, VLIW, formal verification, symbolic execution, theory of equality with uninterpreted functions

### 19 Software Pipelining Irregular Loops On the TMS320C6000 VLIW DSP Architecture

Elana Granston, Eric Stotzer, Joe Zbiciak

August 2001 **OM '01:** Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems

Publisher: ACM

Full text available: pdf(88.54 KB) Additional Information: full citation, abstract, references, index terms

The TMS320C6000 architecture is a leading family of Digital Signal Processors (DSPs). To achieve peak performance, this VLIW architecture relies heavily on software pipelining. Traditionally, software pipelining has been restricted to *regular* ...

**Keywords**: VLIW architectures, WHILE loops, digital signal processors, software pipelining

### 20 An FPGA-based VLIW processor with custom hardware execution

Alex K. Jones, Raymond Hoare, Dara Kusic, Joshua Fazekas, John Foster

February 2005 FPGA '05: Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays

Publisher: ACM

Full text available: pdf(220.52 KB) Additional Information: full citation, abstract, references, cited by, index terms, review

The capability and heterogeneity of new FPGA (Field Programmable Gate Array) devices continues to increase with each new line of devices. Efficiently programming these devices is increasing in difficulty. However, FPGAs continue to be utilized for algorithms ...

Keywords: NIOS, VLIW, compiler, kernels, parallelism, synthesis



### THE ACM DIGITAL LIBRARY


+vliw +cycle +parallel

Terms used: vliw cycle parallel

Found 932 of 238,273


Results 1 - 20 of 932

Result page: 1 2 3 4 5 6 next >>

1 Compiler-directed thermal management for VLIW functional units

Madhu Mutyam, Feihui Li, Vijaykrishnan Narayanan, Mahmut Kandemir, Mary Jane Irwin

June 2006 LCTES '06: Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems

Publisher: ACM

Full text available: pdf(599.24 KB) Additional Information: full citation, abstract, references, index terms

As processors, memories, and other components of today's embedded systems are pushed to higher performance in more enclosed spaces, processor thermal management is quickly becoming a limiting design factor. While previous proposals mostly approached ...

Keywords: IPC, VLIW, thermal


2 Flexible Compiler-Managed L0 Buffers for Clustered VLIW Processors

Enric Gibert, Jesús Sánchez, Antonio González

December 2003 MICRO 36: Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture

Publisher: IEEE Computer Society

Full text available: pdf(167.74 KB) Additional Information: full citation, abstract, cited by, index terms

Wire delays are a major concern for current and forthcoming processors. One approach to attack this problem is to divide the processor into semi-independent units referred to as clusters. A cluster usually consists of a local register file and a subset ...


3 Value prediction in VLIW machines

Tarun Nakra, Rajiv Gupta, Mary Lou Soffa

May 1999 ISCA '99: ACM SIGARCH Computer Architecture News, Volume 27 Issue 2

Publisher: ACM

Full text available:

Additional Information: full citation, abstract, references, cited by, index terms

The performance of VLIW architectures is dependent on the capability of the compiler to detect and exploit instruction-level parallelism during instruction scheduling. To exploit the detected parallelism, instructions are reordered to reduce the length ...

4 Realization of a programmable parallel DSP for high performance image processing applications

Jens Peter Wittenburg, Willm Hinrichs, Johannes Kneip, Martin Ohmacht, Mladen Bereković, Hanno Lieske, Helge Kloos, Peter Pirsch

May 1998 **DAC '98:** Proceedings of the 35th annual conference on Design automation

Publisher: ACM

Full text available: pdf(2.35 MB) Additional Information: full citation, abstract, references, cited by, index terms

Architecture and design of the HiPAR-DSP, a SIMD controlled signal processor with parallel data paths, VLIW and novel memory design. The processor architecture is derived from an analysis of the target algorithms and specified in VHDL on register transfer ...

5 Parallel processing: a smart compiler and a dumb machine

Joseph A. Fisher, John R. Ellis, John C. Ruttenberg, Alexandru Nicolau

April 2004 **ACM SIGPLAN Notices**, Volume 39 Issue 4

Publisher: ACM

Full text available: pdf(1.14 MB) Additional Information: full citation, abstract, references

Multiprocessors and vector machines, the only successful parallel architectures, have coarse-grained parallelism that is hard for compilers to take advantage of. We've developed a new fine-grained parallel architecture and a compiler that together offer ...

6 Implicitly parallel programming models for thousand-core microprocessors

Wen-mei Hwu, Shane Ryoo, Sain-Zee Ueng, John H. Kelm, Isaac Gelado, Sam S. Stone, Robert E. Kidd, Sara S. Baghsorkhi, Aqeel A. Mahesri, Stephanie C. Tsao, Nacho Navarro, Steve S. Lumetta, Matthew I. Frank, Sanjay J. Patel

June 2007 **DAC '07:** Proceedings of the 44th annual conference on Design automation

Publisher: ACM

Full text available: pdf(600.21 KB) Additional Information: full citation, abstract, references, index terms

This paper argues for an implicitly parallel programming model for many-core microprocessors, and provides initial technical approaches towards this goal. In an implicitly parallel programming model, programmers maximize algorithm-level parallelism, ...

**Keywords**: parallel programming

7

#### VLIW Architectures

Anup Gangwar, M. Balakrishnan, Preeti R. Panda, Anshul Kumar

March 2005 DATE '05: Proceedings of the conference on Design, Automation and Test in Europe - Volume 2, Volume 2

Publisher: IEEE Computer Society

Full text available: pdf(210.77 KB) Additional Information: full citation, abstract, cited by,

With new sophisticated compiler technology, it is possible to schedule distant instructions efficiently. As a consequence, the amount of exploitable instruction level parallelism (ILP) in applications has gone up considerably. However, monolithic register ...

### 8 Efficient event-driven simulation of parallel processor architectures

Alexey Kupriyanov, Dmitrij Kissler, Frank Hannig, Jürgen Teich

April 2007 SCOPES '07: Proceedings of the 10th international workshop on Software & compilers for embedded systems

Publisher: ACM

Full text available: pdf(619.62 KB) Additional Information: full citation, abstract,

In this paper we present a new approach for generating high-speed optimized event-driven instruction set level simulators for adaptive massively parallel processor architectures. The simulator generator is part of a methodology for the systematic mapping, ...

**Keywords**: embedded tools, modeling, processor arrays, simulation

### 9 Instruction scheduling for clustered VLIW architectures

Jesús Sánchez, Antonio González

September 2000 ISSS '00: Proceedings of the 13th international symposium on System synthesis

Publisher: IEEE Computer Society

Full text available: pdf(63.06 KB) Additional Information: full citation, abstract, references, cited by

Clustered VLIW organizations are nowadays a common trend in the design of embedded/DSP processors. In this work we propose a novel modulo scheduling approach for such architectures. The proposed technique performs the cluster assignment and the instruction ...

### 10 Automatic formal verification for scheduled VLIW code

Xiushan Feng, Alan J. Hu

June 2002 LCTES/SCOPES '02: Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems

Publisher: ACM

Full text available: pdf(113.92 KB) Additional Information: full citation, abstract, references, cited by, index terms

VLIW processors are attractive for many embedded applications, but VLIW code scheduling, whether by hand or by compiler, is extremely challenging. In this paper, we extend previous work on automated verification of low-level software to handle the complexity ...

Keywords: DSP, VLIW, formal verification, symbolic execution, theory of equality with uninterpreted functions

### 11 Value prediction in VLIW machines

Tarun Nakra, Rajiv Gupta, Mary Lou Soffa

May 1999 ISCA '99: Proceedings of the 26th annual international symposium on Computer architecture

Publisher: IEEE Computer Society

Full text available: pdf(226.09 KB) Publisher Site Additional Information: full citation, abstract, references, cited by, index terms

The performance of VLIW architectures is dependent on the capability of the compiler to detect and exploit instruction-level parallelism during instruction scheduling. To exploit the detected parallelism, instructions are reordered to reduce the length ...

### 12 A fast parallel reed-solomon decoder on a reconfigurable architecture

Arezou Koohi, Nader Bagherzadeh, Chengzi Pan

October 2003 CODES+ISSS '03: Proceedings of the 1st IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis

Publisher: ACM

Full text available: pdf(292.18 KB) Additional Information: full citation, abstract, references, cited by, index terms

This paper presents a software implementation of a very fast parallel Reed-Solomon decoder on the second generation of MorphoSys reconfigurable computation platform, which is targeting on streamed applications such as multimedia and DSP. Numerous modifications ...

Keywords: Berlekamp algorithm, Chein search, Reed-Solomon codes, SIMD processor, reconfigurable architecture

### 13 Interactive presentation: Time-constrained clustering for DSE of clustered VLIW-ASP

Mario Schölzel

April 2007 DATE '07: Proceedings of the conference on Design, automation and test in Europe

Publisher: EDA Consortium

Full text available: pdf(277.07 KB) Additional Information: full citation, abstract, references

In this paper we describe a new time-constrained clustering algorithm. It is coupled with a time-constrained scheduling algorithm and used for Design-Space-Exploration (DSE) of clustered VLIW processors with heterogeneous clusters and heterogeneous functional ...

### 14 A scalable wide-issue clustered VLIW with a reconfigurable interconnect

Osvaldo Colavin, Davide Rizzo

October 2003 CASES '03: Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems

Publisher: ACM

Full text available: pdf(365.26 KB) Additional Information: full citation, abstract, references, index terms

Clustered VLIW architectures have been widely adopted in modern embedded multimedia applications for their ability to exploit high degrees of ILP with reasonable trade-off in complexity and silicon costs. Studies have however shown limited performance ...

**Keywords**: IDCT, clustered VLIW, modulo scheduling, reconfigurable co-processor (RCP)

### 15 Branch prediction techniques for low-power VLIW processors

G. Palermo, M. Sami, C. Silvano, V. Zaccaria, R. Zafalon

April 2003 GLSVLSI '03: Proceedings of the 13th ACM Great Lakes symposium on VLSI

Publisher: ACM

Full text available: pdf(178.89 KB) Additional Information: full citation, abstract, references, cited by, index terms

Main goal of the paper is to introduce a branch prediction scheme suitable for energy-efficient VLIW (Very Long Instruction Word) processors aiming at reducing the energy associated with the prediction phase by filtering the accesses to the branch predictor ...

Keywords: VLIW processors, branch prediction, low-power design

### 16 Parallel processor scheduling with delay constraints

Daniel W. Engels, Jon Feldman, David R. Karger, Matthias Ruhl

January 2001 SODA '01: Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms

Publisher: Society for Industrial and Applied Mathematics

Full text available: pdf(729.37 KB) Additional Information: full citation, abstract, references, index terms

We consider the problem of scheduling unit-length jobs on identical parallel machines such that the makespan of the resulting schedule is minimized. Precedence constraints impose a partial order on the jobs, and both communication and precedence delays ...

### 17 Loop fusion for clustered VLIW architectures

Yi Qian, Steve Carr, Philip Sweany

July 2002 LCTES/SCOPES '02: ACM SIGPLAN Notices, Volume 37 Issue 7

Publisher: ACM

Full text available: pdf(111.58 KB) Additional Information: full citation, abstract, references, cited by, index terms

Embedded systems require maximum performance from a processor within significant constraints in power consumption and chip cost. Using software pipelining, high-performance digital signal processors can often exploit considerable instruction-level parallelism ...

Keywords: clustered VLIW architectures, loop fusion

### 18 A low-cost mixed-mode parallel processor architecture for embedded systems

Shorin Kyo, Takuya Koga, Lieske Hanno, Shouhei Nomoto, Shin'ichiro Okazaki

June 2007 ICS '07: Proceedings of the 21st annual international conference on Supercomputing

Publisher: ACM

Full text available: pdf(516.23 KB) Additional Information: full citation, abstract, references, index terms

A scalable SIMD/MIMD mixed-mode parallel processor architecture called XC core is proposed to meet the high and diverse performance requirements of embedded multimedia applications. XC core supports both the SIMD and MIMD computing models at low hardware ...

**Keywords**: MIMD, SIMD, embedded systems, mixed-mode, multimedia processing, parallel architectures, tile architectures

### 19 A new register file access architecture for software pipelining in VLIW processors

Yanjun Zhang, Hu He, Yihe Sun

January 2005 ASP-DAC '05: Proceedings of the 2005 conference on Asia South Pacific design automation

Publisher: ACM

Full text available: pdf(305.78 KB) Additional Information: full citation, abstract,

This paper presents a novel architecture of register files that combines the local register files and the global register file for clustered VLIW (Very Long Instruction Word) processors. The communication between function units through global register ...

### 20 VLIW compilation techniques in a superscalar environment

Kemal Ebcioglu, Randy D. Groves, Ki-Chang Kim, Gabriel M. Silberman,

August 1994 PLDI '94: Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation

Publisher: ACM

Full text available: pdf(1.30 MB) Additional Information: full citation, abstract, references, cited by, index terms

We describe techniques for converting the intermediate code representation of a given program, as generated by a modern compiler, to another representation which produces the same run-time results, but can run faster on a superscalar machine. The algorithms, ...

Keywords: VLIW, compiler optimizations, global scheduling, profiling directed feedback, software pipelining, superscalars

Results 1 - 20 of 932 Result page: 1 2 3 4 5 6 7 8 9 10

The ACM Portal is published by the Association for Computing Machinery. Copyright © 2008 ACM, Inc. Terms of Usage Privacy Policy Code of Ethics Contact Us


#### THE ACM DIGITAL LIBRARY


+vliw +binary +compatible

Terms used: vliw binary compatible

Found 89 of 238,273


Results 1 - 20 of 89

Result page: 1 2 3 4 5 next >>

### 1 Dynamic parallelization and mapping of binary executables on hierarchical platforms

Efe Yardimci, Michael Franz

May 2006 CF '06: Proceedings of the 3rd conference on Computing frontiers

Publisher: ACM

Full text available: pdf(241.09 KB) Additional Information: full citation, abstract, references, index terms

As performance improvements are being increasingly sought via coarse-grained parallelism, established expectations of continued sequential performance increases are not being met. Current trends in computing point towards platforms seeking performance ...

**Keywords**: continuous optimization, dynamic parallelization


### 2 Dynamic reconfiguration with binary translation: breaking the ILP barrier with software compatibility

Antonio Carlos S. Beck, Luigi Carro

June 2005 DAC '05: Proceedings of the 42nd annual conference on Design automation

Publisher: ACM

Full text available: pdf(811.10 KB) Additional Information: full citation, abstract, references, cited by, index terms

In this paper we present the impact of dynamically translating any sequence of instructions into combinational logic. The proposed approach combines a ...

Keywords: binary translation, java, power consumption, reconfigurable processors


### 3 An FPGA-based VLIW processor with custom hardware execution

Alex K. Jones, Raymond Hoare, Dara Kusic, Joshua Fazekas, John Foster

February 2005 FPGA '05: Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays

Publisher: ACM

Full text available: pdf(220.52 KB) Additional Information: full citation, abstract, references, cited by, index terms, review

The capability and heterogeneity of new FPGA (Field Programmable Gate Array) devices continues to increase with each new line of devices. Efficiently programming these devices is increasing in difficulty. However, FPGAs continue to be utilized for algorithms ...

Keywords: NIOS, VLIW, compiler, kernels, parallelism, synthesis

### 4 Reducing power while increasing performance with supercisc

Alex K. Jones, Raymond Hoare, Dara Kusic, Gayatri Mehta, Josh Fazekas, John Foster

August 2006 ACM Transactions on Embedded Computing Systems (TECS), Volume 5 Issue 3

Publisher: ACM

Full text available: pdf(675.92 KB) Additional Information: full citation, abstract, references, cited by, index terms

Multiprocessor Systems on Chips (MPSoCs) have become a popular architectural technique to increase performance. However, MPSoCs may lead to undesirable power consumption characteristics for computing systems that have strict power budgets, such as PDAs, ...

Keywords: Low-power, VLIW, multicore architectures, predication, synthesis

### 5 Constructing and exploiting linear schedules with prescribed parallelism

Alain Darte, Robert Schreiber, B. Ramakrishna Rau, Frédéric Vivien

January 2002 ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 7 Issue 1

Publisher: ACM

Full text available: pdf(159.04 KB) Additional Information: full citation, abstract, references, cited by, index terms

We present two new results of importance in code generation for and synthesis of synchronously scheduled parallel processor arrays and multicluster VLIWs. The first is a new practical method for constructing a linear schedule for the iterations of a ...

**Keywords**: Linear schedule, multicluster VLIW, systolic array

### 6 DAISY: dynamic compilation for 100% architectural compatibility

Kemal Ebcioğlu, Erik R. Altman

May 1997 ISCA '97: ACM SIGARCH Computer Architecture News, Volume 25 Issue 2

Publisher: ACM

Full text available: pdf(1.97 MB) Additional Information: full citation, abstract, references, cited by, index terms

Although VLIW architectures offer the advantages of simplicity of design and high issue rates, a major impediment to their use is that they are not compatible with the existing software base. We describe new simple hardware features for a VLIW machine ...

**Keywords**: binary translation, dynamic compilation, instruction-level parallelism, object code compatible VLIW, superscalar

### 7 A comparison of software and hardware techniques for x86 virtualization

Keith Adams, Ole Agesen

October 2006 ASPLOS-XII: ACM SIGARCH Computer Architecture News, Volume 34 Issue 5

Publisher: ACM

Full text available: pdf(236.66 KB) Additional Information: full citation, abstract, references, index terms

Until recently, the x86 architecture has not permitted classical trap-and-emulate virtualization. Virtual Machine Monitors for x86, such as VMware ® Workstation and Virtual PC, have instead used binary translation of the guest kernel code. However, ...

**Keywords**: MMU, SVM, TLB, VT, dynamic binary translation, nested paging, virtual machine monitor, virtualization, x86

### 8 A comparison of software and hardware techniques for x86 virtualization

Keith Adams, Ole Agesen

October 2006 ASPLOS-XII: ACM SIGOPS Operating Systems Review, Volume 40 Issue 5

Publisher: ACM

Full text available: pdf(236.66 KB) Additional Information: full citation, abstract, references,

Until recently, the x86 architecture has not permitted classical trap-and-emulate virtualization. Virtual Machine Monitors for x86, such as VMware ® Workstation and Virtual PC, have instead used binary translation of the guest kernel code. However, ...

Keywords: MMU, SVM, TLB, VT, dynamic binary translation, nested paging, virtual machine monitor, virtualization, x86

### 9 Incremental Commit Groups for Non-Atomic Trace Processing

Matt T. Yourst, Kanad Ghose

November 2005 MICRO 38: Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture

Publisher: IEEE Computer Society

Full text available: pdf(614.37 KB) Publisher Site Additional Information: full citation, abstract, references, index terms

We introduce techniques to support efficient non-atomic execution of very long traces on a new binary translation based, x86-64 compatible VLIW microprocessor. Incrementally committed long traces significantly reduce wasted computations on exception ...

Keywords: binary translation, VLIW, commitment, trace prediction

### 10 Prematerialization: reducing register pressure for free

Ivan D. Baev, Richard E. Hank, David H. Gross

September 2006 PACT '06: Proceedings of the 15th international conference on Parallel architectures and compilation techniques

Publisher: ACM

Full text available: pdf(159.54 KB) Additional Information: full citation, abstract, references,

Modern compiler transformations that eliminate redundant computations or reorder instructions, such as partial redundancy elimination and instruction scheduling, are very effective in improving application performance but tend to create longer and potentially ...

Keywords: Itanium, VLIW, register allocation, register pressure, rematerialization

### 11 DAISY: dynamic compilation for 100% architectural compatibility

Kemal Ebcioğlu, Erik R. Altman

June 1997 **ISCA '97:** Proceedings of the 24th annual international symposium on Computer architecture

Publisher: ACM

Full text available: pdf(1.97 MB) Additional Information: full citation, abstract, references, cited by, index terms

Although VLIW architectures offer the advantages of simplicity of design and high issue rates, a major impediment to their use is that they are not compatible with the existing software base. We describe new simple hardware features for a VLIW machine ...

**Keywords**: binary translation, dynamic compilation, instruction-level parallelism, object code compatible VLIW, superscalar

### 12 A comparison of software and hardware techniques for x86 virtualization

Keith Adams, Ole Agesen

October 2006 **ASPLOS-XII:** Proceedings of the 12th international conference on Architectural support for programming languages and operating systems

Publisher: ACM

Full text available: pdf(236.66 KB) Additional Information: full citation, abstract, references, index terms

**Keywords**: MMU, SVM, TLB, VT, dynamic binary translation, nested paging, virtual machine monitor, virtualization, x86

### 13 A comparison of software and hardware techniques for x86 virtualization

Keith Adams, Ole Agesen

November 2006 ASPLOS-XII: ACM SIGPLAN Notices, Volume 41 Issue 11

Publisher: ACM

Full text available: pdf(236.66 KB) Additional Information: full citation, abstract, references, index terms

Until recently, the x86 architecture has not permitted classical trap-and-emulate virtualization. Virtual Machine Monitors for x86, such as VMware ® Workstation and Virtual PC, have instead used binary translation of the guest kernel code. However, ...

**Keywords**: MMU, SVM, TLB, VT, dynamic binary translation, nested paging, virtual machine monitor, virtualization, x86

### 14 Customized instruction-sets for embedded processors

Joseph A. Fisher

June 1999 **DAC '99:** Proceedings of the 36th ACM/IEEE conference on Design automation

Publisher: ACM

Full text available: pdf(55.91 KB) Additional Information: full citation, appendices and supplements, references, cited by, index terms

Keywords: VLIW, custom processors, embedded processors, instruction-level parallelism, mass customization of toolchains

### 15 Static strands: safely collapsing dependence chains for increasing embedded power efficiency

Peter G. Sassone, D. Scott Wills, Gabriel H. Loh

July 2005 LCTES '05: ACM SIGPLAN Notices, Volume 40 Issue 7

Full text available: pdf(274.43 KB) Additional Information: full citation, abstract, references, cited by, index terms

Modern embedded processors are designed to maximize execution efficiency: the amount of performance achieved per unit of energy dissipated while meeting minimum performance levels. To increase this efficiency we propose utilizing static strands, ...

**Keywords**: architecture, dependency collapsing, embedded, energy, sequentiality

### 16 Static strands: safely collapsing dependence chains for increasing embedded power efficiency

Peter G. Sassone, D. Scott Wills, Gabriel H. Loh

June 2005 LCTES '05: Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems

Publisher: ACM

Full text available: pdf(274.43 KB) Additional Information: full citation, abstract, references, cited by, index terms

Modern embedded processors are designed to maximize execution efficiency: the amount of performance achieved per unit of energy dissipated while meeting minimum performance levels. To increase this efficiency we propose utilizing static strands, ...

**Keywords**: architecture, dependency collapsing, embedded, energy, sequentiality

### 17 Computation of minimal counterexamples by using black box techniques and symbolic methods

Tobias Nopper, Christoph Scholl, Bernd Becker

November 2007 ICCAD '07: Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design

Publisher: IEEE Press

Full text available: pdf(299.94 KB) Additional Information: full citation, abstract, references

Computing counterexamples is a crucial task for error diagnosis and debugging of sequential systems. If an implementation does not fulfill its specification, counterexamples are used to explain the error effect to the designer. In order to be understood ...

### 18 Virtual Hardware Byte Code as a Design Platform for Reconfigurable Embedded Systems

Sebastian Lange, Udo Kebschull

March 2003 DATE '03: Proceedings of the conference on Design, Automation and Test in Europe - Volume 1, Volume 1

Publisher: IEEE Computer Society

Full text available: Publisher Site Additional Information: full citation, abstract, index terms

Reconfigurable hardware will be used in many future embedded applications. Since most of these embedded systems will be temporarily or permanently connected to a network, the possibility to reload parts of the application at run time arises. In the 90ies ...

Keywords: Virtual Hardware Machine, Byte Code, FPGA

### 19 Instruction encoding synthesis for architecture exploration using hierarchical processor models

Achim Nohl, Volker Greive, Gunnar Braun, Andreas Andreas, Rainer Leupers, Oliver Schliebusch, Heinrich Meyr

June 2003 **DAC '03:** Proceedings of the 40th conference on Design automation

Publisher: ACM

Full text available: pdf(533.22 KB) Additional Information: full citation, abstract, references, cited by, index terms

This paper presents a novel instruction encoding generation technique for use in architecture exploration for application specific processors. The underlying exploration methodology is based on successive processor model refinement combined with simulation ...

Keywords: instruction encoding, instruction set architectures

### 20 An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors

Nathan Clark, Jason Blome, Michael Chu, Scott Mahlke, Stuart Biles, Krisztian Flautner

June 2005 ISCA '05: Proceedings of the 32nd annual international symposium on Computer Architecture

Publisher: IEEE Computer Society

Full text available: pdf(379.01 KB) Additional Information: full citation, abstract, cited by, index terms

Instruction set customization is an effective way to improve processor performance. Critical portions of application data-flow graphs are collapsed for accelerated execution on specialized hardware. Collapsing dataflow subgraphs will compress the latency ...

Results 1 - 20 of 89 Result page: 1 2 3 4 5 next >>





#### Welcome United States Patent and Trademark Office

Search Results


Results for "((processor and binary and compatible)<in>metadata)"

Your search matched 14 of 1733971 documents.

A maximum of 100 results are displayed, 25 to a page, sorted by Relevance in Descending order.




» Key

IEEE JNL: IEEE Journal or Magazine

IET JNL: IET Journal or Magazine

IEEE CNF: IEEE Conference Proceeding

IET CNF: IET Conference Proceeding

IEEE STD: IEEE Standard


1. Instruction set selection for ASIP design

Gschwind, M.;

Hardware/Software Codesign, 1999. (CODES '99) Proceedings of the Seventh International Workshop on

3-5 May 1999 Page(s):7 - 11

Digital Object Identifier 10.1109/HSC.1999.777382

AbstractPlus | Full Text: PDF(408 KB) IEEE CNF

Rights and Permissions

2. Processor design and implementation for real-time testing of embedded systems

Walters, G.; King, E.; Kessinger, R.; Fryer, R.;

Digital Avionics Systems Conference, 1998. Proceedings., 17th DASC. The AIAA/IEEE/SAE

Volume 1, 31 Oct.-7 Nov. 1998 Page(s):B44/1 - B44/8 vol.1

Digital Object Identifier 10.1109/DASC.1998.741470

AbstractPlus | Full Text: PDF(668 KB) IEEE CNF

Rights and Permissions

3. Development of a well-structured industrial vision system

Bien, Z.; Oh, S.-R.; Won, J.; You, B.-J.; Han, D.; Kim, J.O.;

Industrial Electronics Society, 1990. IECON '90., 16th Annual Conference of IEEE

27-30 Nov. 1990 Page(s):501 - 506 vol.1

Digital Object Identifier 10.1109/IECON.1990.149191

AbstractPlus | Full Text: PDF(540 KB) IEEE CNF

Rights and Permissions

4. Toward a processor core for real-time capable autonomic systems

Uhrig, S.; Maier, S.; Ungerer, T.;

Signal Processing and Information Technology, 2005. Proceedings of the Fifth IEEE International Symposium on

18-21 Dec. 2005 Page(s):19 - 22

Digital Object Identifier 10.1109/ISSPIT.2005.1577063

AbstractPlus | Full Text: PDF(226 KB) | IEEE CNF

Rights and Permissions

5. The reconfigurable streaming vector processor (RSVP™)

Ciricescu, S.; Essick, R.; Lucas, B.; May, P.; Moat, K.; Norris, J.; Schuette, M.; Saidi, A.;

Microarchitecture, 2003. MICRO-36. Proceedings. 36th Annual IEEE/ACM International Symposium

2003 Page(s):141 - 150

Digital Object Identifier 10.1109/MICRO.2003.1253190

AbstractPlus | Full Text: PDF(430 KB) IEEE CNF

Rights and Permissions

#### 6. Implicit manipulation of equivalence classes using binary decision diagrams

Lin, B.; Newton, A.R.;

Computer Design: VLSI in Computers and Processors, 1991. ICCD '91. Proceedings., 1991 IEEE International Conference on

14-16 Oct. 1991 Page(s):81 - 85

Digital Object Identifier 10.1109/ICCD.1991.139995

AbstractPlus | Full Text: PDF(508 KB) IEEE CNF

Rights and Permissions

#### 7. Circuit techniques in a 266-MHz MMX-enabled processor

Draper, D.; Crowley, M.; Holst, J.; Favor, G.; Schoy, A.; Trull, J.; Ben-Meir, A.; Khanna, R.; Wend Krishna, R.; Nolan, J.; Mallick, D.; Partovi, H.; Roberts, M.; Johnson, M.; Lee, T.;

Solid-State Circuits, IEEE Journal of

Volume 32, Issue 11, Nov. 1997 Page(s):1650 - 1664

Digital Object Identifier 10.1109/4.641685

AbstractPlus | References | Full Text: PDF(328 KB) | IEEE JNL

Rights and Permissions

#### 8. The HP PA-8000 RISC CPU

Kumar, A.;

Micro, IEEE

Volume 17, Issue 2, March-April 1997 Page(s):27 - 32

Digital Object Identifier 10.1109/40.592310

AbstractPlus | Full Text: PDF(128 KB) IEEE JNL

Rights and Permissions

#### 9. The Motorola DSP 96002 IEEE floating-point digital signal processor

Kloker, K.L.; Lindsley, B.; Liberman, S.; Marino, P.; Rushinek, E.; Hillman, G.D.;

Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on

23-26 May 1989 Page(s):2480 - 2483 vol.4

Digital Object Identifier 10.1109/ICASSP.1989.266970

AbstractPlus | Full Text: PDF(320 KB) | IEEE CNF

Rights and Permissions

#### 10. The i486 CPU: executing instructions in one clock cycle

Crawford, J.H.;

Micro, IEEE

Volume 10, Issue 1, Feb. 1990 Page(s):27 - 36

Digital Object Identifier 10.1109/40.46766

AbstractPlus | Full Text: PDF(848 KB) | IEEE JNL

Rights and Permissions

#### 11. Incremental commit groups for non-atomic trace processing

Yourst, M.T.; Ghose, K.;

Microarchitecture, 2005. MICRO-38. Proceedings. 38th Annual IEEE/ACM International Symposium on

12-16 Nov. 2005 Page(s):12 pp.

Digital Object Identifier 10.1109/MICRO.2005.23

AbstractPlus | Full Text: PDF(608 KB) IEEE CNF

Rights and Permissions

#### 12. Design and implementation of a high-performance and silicon efficient arithmetic coding accelerator for the H.264 advanced video codec

Nunez-Yanez, J.L.; Chouliaras, V.A.;

Application-Specific Systems, Architecture Processors, 2005. ASAP 2005. 16th IEEE International Conference on

23-25 July 2005 Page(s):411 - 416

Digital Object Identifier 10.1109/ASAP.2005.30

AbstractPlus | Full Text: PDF(216 KB) | IEEE CNF

Rights and Permissions

13. The Scalable Coherent Interface: scaling to high-performance systems

James, D.V.;

Compcon Spring '94, Digest of Papers.

28 Feb.-4 March 1994 Page(s):64 - 71

Digital Object Identifier 10.1109/CMPCON.1994.282943

AbstractPlus | Full Text: PDF(556 KB) | IEEE CNF

Rights and Permissions

14. MSI High-Speed Low-Power GaAs Integrated Circuits Using Schottky Diode FET Logic

Long, S.I.; Lee, F.S.; Zucca, R.; Welch, B.M.; Eden, R.C.;

Microwave Theory and Techniques, IEEE Transactions on

Volume 28, Issue 5, May 1980 Page(s):466 - 472

AbstractPlus | Full Text: PDF(1056 KB) IEEE JNL

Rights and Permissions


© Copyright 2007 IEEE – All Rights I


### THE ACM DIGITAL LIBRARY


processor +isa

Terms used: processor isa

Found 1,385 of 238,


Results 1 - 20 of 1,385

Result page: 1 2 3 4 5 6 7 8 9 10 next >>

### 1 Overcoming the limitations of conventional vector processors

Christos Kozyrakis, David Patterson

May 2003 ISCA '03: ACM SIGARCH Computer Architecture News, Volume 31 Issue 2

Publisher: ACM

Full text available: pdf(160.23 KB) Additional Information: full citation, abstract, references, cited by

Despite their superior performance for multimedia applications, vector processors have three limitations that hinder their widespread acceptance. First, the complexity and size of the centralized vector register file limits the number of functional units. ...

### 2 A Floorplan-Aware Dynamic Inductive Noise Controller for Reliable Processor Design

Fayez Mohamood, Michael B. Healy, Sung Kyu Lim, Hsien-Hsin S. Lee

December 2006 MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture

Publisher: IEEE Computer Society

Full text available: pdf(428.63 KB) Additional Information: full citation, abstract, index terms

Power delivery is a growing reliability concern in microprocessors as the industry moves toward feature-rich, power-hungrier designs. To battle the ever-aggravating power consumption, modern microprocessor designers or researchers propose and apply ...


### 3 A study of slipstream processors

Zach Purser, Karthik Sundaramoorthy, Eric Rotenberg

December 2000 MICRO 33: Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture

Publisher: ACM

Full text available: pdf(130.26 KB) ps(398.01 KB) Publisher Site Additional Information: full citation, references, cited by, index terms

### 4 A scalable low power issue queue for large instruction window processors

Rajesh Vivekanandham, Bharadwaj Amrutur, R. Govindarajan

June 2006 ICS '06: Proceedings of the 20th annual international conference on Supercomputing

Publisher: ACM

Full text available: pdf(1.18 MB) Additional Information: full citation, abstract, references, index terms

Large instruction windows and issue queues are key to exploiting greater instruction level parallelism in out-of-order superscalar processors. However, the cycle time and energy consumption of conventional large monolithic issue queues are high. Previous ...

**Keywords**: complexity-effective architecture, issue logic, low-power architecture, wakeup logic

5 Reducing off-chip memory access costs using data recomputation in embedded chip multi-processors

Hakduran Koc, Mahmut Kandemir, Ehat Ercanli, Ozcan Ozturk

June 2007 **DAC '07:** Proceedings of the 44th annual conference on Design automation

Publisher: ACM

Full text available: pdf(358.78 KB) Additional Information: full citation, abstract, references, index terms

There have been numerous efforts on Scratch-Pad Memory (SPM) management in the context of single CPU systems and, more recently, multi-processor architectures. This paper presents a novel SPM space utilization strategy, for embedded chip multi-processor ...

**Keywords**: CMP, data recomputation, scratch-pad memory

6 Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance

Rakesh Kumar, Dean M. Tullsen, Parthasarathy Ranganathan, Norman P. Jouppi, Keith I.

June 2004 **ISCA '04:** Proceedings of the 31st annual international symposium on Computer architecture

Publisher: IEEE Computer Society

Full text available: pdf(223.32 KB) Additional Information: full citation, abstract, references, cited by

A single-ISA heterogeneous multi-core architecture is a chip multiprocessor composed of cores of varying size, performance, and complexity. This paper demonstrates that this architecture can provide significantly higher performance in the same area than ...

7 Performance of image and video processing with general-purpose processors and media ISA extensions

Parthasarathy Ranganathan, Sarita Adve, Norman P. Jouppi

May 1999 **ISCA '99:** Proceedings of the 26th annual international symposium on Computer architecture

Publisher: IEEE Computer Society

Full text available: Additional Information: full citation, abstract, references, cited by, index terms

This paper aims to provide a quantitative understanding of the performance of image and video processing applications on general-purpose processors, without and with media ISA extensions. We use detailed simulation of 12 benchmarks to study the effectiveness ...

8 High-performance packet classification algorithm for many-core and multithreaded network processor

Duo Liu, Bei Hua, Xianghui Hu, Xinan Tang

October 2006 **CASES '06:** Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems

Publisher: ACM

Full text available: pdf(1.14 MB)

Additional Information: full citation, abstract, references, index terms

Packet classification is crucial for the Internet to provide more value-added services and guaranteed quality of service. Besides hardware-based solutions, many software-based classification algorithms have been proposed. However, classifying at 10Gbps ...

**Keywords**: architecture, embedded system design, multithreading, network processor, packet classification, thread-level parallelism

9 Energy Optimization of Subthreshold-Voltage Sensor Network Processors

Leyla Nazhandali, Bo Zhai, Javin Olson, Anna Reeves, Michael Minuth, Ryan Helfand, Sanjay Pant, Todd Austin, David Blaauw

May 2005 ISCA '05: ACM SIGARCH Computer Architecture News, Volume 33 Issue 2

Publisher: ACM

Full text available: pdf(269.50 KB) Additional Information: full citation, abstract, cited by, index terms

Sensor network processors and their applications are a growing area of focus in computer system research and design. Inherent to this design space is a reduced processing performance requirement and extremely high energy constraints, such that sensor ...

10 Parallel programming models for a multi-processor SoC platform applied to high-speed traffic management

Pierre G. Paulin, Chuck Pilkington, Michel Langevin, Essaid Bensoudane, Gabriela Nicolescu

September 2004 CODES+ISSS '04: Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis

Publisher: ACM

Full text available: pdf(238.83 KB) Additional Information: full citation, abstract, references, cited by, index terms

In this paper, we describe the MultiFlex multi-processor SoC programming environment, with focus on two programming models: a distributed system object component (DSOC) message passing model, and a symmetrical multi-processing (SMP) model using shared ...

Keywords: embedded software, multi-processor systems, system-on-chip

11 Timing analysis of embedded software for speculative processors

Tulika Mitra, Abhik Roychoudhury, Xianfeng Li

October 2002 ISSS '02: Proceedings of the 15th international symposium on System Synthesis

Publisher: ACM

Full text available: pdf(200.79 KB) Additional Information: full citation, abstract, references, cited by, index terms

Static timing analysis of embedded software is important for systems with hard real-time constraints. To accurately estimate time bounds, it is essential to model the underlying micro architecture. In this paper, we study static timing analysis of embedded ...

**Keywords**: branch prediction, worst case execution time

12 A code compression advisory tool for embedded processors

Sreejith K Menon, Priti Shankar

March 2005 SAC '05: Proceedings of the 2005 ACM symposium on Applied computing

Publisher: ACM

Full text available: pdf(158.32 KB) Additional Information: full citation, abstract, references, index terms

We present a tool which is designed to be used as a code compression advisory system for object code to be run on an embedded processor. All the compression schemes support runtime random decompression. Given the machine instruction set architecture, ...

Keywords: code compression, embedded system tool, run time decompression

13 Decoupling local variable accesses in a wide-issue superscalar processor

Sangyeun Cho, Pen-Chung Yew, Gyungho Lee

May 1999 ISCA '99: Proceedings of the 26th annual international symposium on Computer architecture

Publisher: IEEE Computer Society

Full text available: pdf(670.92 KB) Publisher Site

Additional Information: full citation, abstract, references, cited by, index terms

Providing adequate data bandwidth is extremely important for a wide-issue superscalar processor to achieve its full performance potential. Adding a large number of ports to a data cache, however, becomes increasingly inefficient and can add to the hardware ...

14 Evaluate the performance changes of processor simulator benchmarks When context switches are incorporated

Rajaa S. Shindi, Shaun Cooper

November 2006 SIGAda '06: ACM SIGAda Ada Letters, Volume XXVI Issue 3

Publisher: ACM

Full text available: pdf(267.85 KB) Additional Information: full citation, abstract, references, index terms

Building state-of-the-art processors is expensive and time consuming. Once the design is finalized and implemented, simulations are used to evaluate functionality and performance of the system. The Sim-alpha processor simulator is one of the most important ...

**Keywords**: cache, context switches, cpu, processor simulators, sim-alpha

15 Processor Acceleration Through Automated Instruction Set Customization

Nathan Clark, Hongtao Zhong, Scott Mahlke

December 2003 MICRO 36: Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture

Publisher: IEEE Computer Society

Full text available: pdf(342.27 KB) Additional Information: full citation, abstract, cited by, index terms

Application-specific extensions to the computational capabilities of a processor provide an efficient mechanism to meet the growing performance and power demands of embedded applications. Hardware, in the form of new function units (or co-processors), and ...

16 Power efficient comparators for long arguments in superscalar processors

Dmitry Ponomarev, Gurhan Kucuk, Oguz Ergin, Kanad Ghose

August 2003 ISLPED '03: Proceedings of the 2003 international symposium on Low power electronics and design

Publisher: ACM

Full text available: pdf(69.80 KB) Additional Information: full citation, abstract, references, index terms

Traditional pulldown comparators that are used to implement associative addressing logic in superscalar microprocessors dissipate energy on a mismatch in any bit position in the comparands. As mismatches occur much more frequently than matches in many ...

**Keywords**: low-power comparators, superscalar datapath



17 A constraint-based solution for on-line testing of processors embedded in real-time applications

Marcelo Moraes, Érika Cota, Luigi Carro, Flávio Wagner, Marcelo Lubaszewski

September 2005 **SBCCI '05:** Proceedings of the 18th annual symposium on Integrated circuits and system design

Publisher: ACM

Full text available: pdf(133.86 KB) Additional Information: full citation, abstract, references, index terms

Software-based self-test has been proposed as a low-cost strategy for on-line periodic testing of embedded processors. In this paper, we show that structural test programs composed only by regular deterministic self-test routines may be unfeasible in ...

**Keywords**: embedded processors, on-line testing, real-time systems, software-based self-test, test space exploration

18 A scalable, clustered SMT processor for digital signal processing

Mladen Berekovic, Sören Moch, Peter Pirsch

September 2003 **MEDEA '03:** Proceedings of the 2003 workshop on MEmory performance: DEaling with Applications, systems and architecture

Publisher: ACM

Full text available: pdf(356.32 KB) Additional Information: full citation, abstract, references

A scalable, distributed, processor architecture is presented that emphasizes on high performance computing for digital signal processing applications by combining high frequency design techniques with a very high degree of parallel processing on a chip. ...

19 Effective Management of DRAM Bandwidth in Multicore Processors

Nauman Rafique, Won-Taek Lim, Mithuna Thottethodi

September 2007 PACT '07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007) - Volume 00, Volume 00

Publisher: IEEE Computer Society

Full text available: pdf(308.25 KB) Additional Information: full citation, abstract

Technology trends are leading to increasing number of cores on chip. All these cores inherently share the DRAM bandwidth. The on-chip cache resources are limited and in many situations, cannot hold the working set of the threads running on all these ...

20 Utilizing custom registers in application-specific instruction set processors for register spills elimination

Hai Lin, Yunsi Fei

March 2007 **GLSVLSI '07:** Proceedings of the 17th great lakes symposium on VLSI

Publisher: ACM

Full text available: pdf(661.15 KB) Additional Information: full citation, abstract, references, index terms

Application-specific instruction set processor (ASIP) has become an important design choice for embedded systems. It can achieve both high flexibility offered by the base processor core and high performance and energy efficiency offered by the dedicated ...

Keywords: ASIP, custom register, register file

Results 1 - 20 of 1,385


ACM Portal is published by the Association for Computing Machinery. Copyright © 2008 ACM, Inc.


## ACM DIGITAL LIBRARY


processor +instruction +set +architecture Terms used: processor instruction set architecture

Found **8,803** of **238,273** 

Results 1 - 20 of 8,803

1 Energy-efficient instruction set synthesis for application-specific processors

Jong-eun Lee, Kiyoung Choi, Nikil D. Dutt

August 2003 ISLPED '03: Proceedings of the 2003 international symposium on Low power electronics and design

Publisher: ACM

Full text available: pdf(78.03 KB) Additional Information: full citation, abstract, references, cited by, index terms

Several techniques have been proposed to enhance the energy-efficiency of ASIPs (Application-Specific Instruction set Processors). While those techniques can reduce the energy consumption with a minimal change in the instruction set (IS), they fail to ...

**Keywords**: application-specific instruction set processor (ASIP), customization, energy-delay product, instruction encoding, low power


2 Overcoming the limitations of conventional vector processors

Christos Kozyrakis, David Patterson

May 2003 ISCA '03: ACM SIGARCH Computer Architecture News, Volume 31 Issue 2

Publisher: ACM

Full text available: pdf(160.23 KB) Additional Information: full citation, abstract, references, cited by

Despite their superior performance for multimedia applications, vector processors have three limitations that hinder their widespread acceptance. First, the complexity and size of the centralized vector register file limits the number of functional units. ...


3 Ease: an environment for architecture study and experimentation

Jack W. Davidson, David B. Whalley

April 1990 SIGMETRICS '90: ACM SIGMETRICS Performance Evaluation Review, Volume 18 Issue 1

Publisher: ACM

Full text available: pdf(220.72 KB) Additional Information: full citation, abstract, references, cited by, index terms

Gathering detailed measurements of the execution behavior of an instruction set architecture is difficult. There are two major problems that must be solved. First, for meaningful measurements to be obtained, programs that represent typical work load ...

4 On increasing architecture awareness in program optimizations to bridge the gap between peak and sustained processor performance: matrix-multiply revisited

David Parello, Olivier Temam, Jean-Marie Verdun

November 2002 **Supercomputing '02:** Proceedings of the 2002 ACM/IEEE conference on Supercomputing

Publisher: IEEE Computer Society Press

Full text available: pdf(263.32 KB) Additional Information: full citation, abstract, references, cited by, index terms

As the complexity of processor architectures increases, there is a widening gap between peak processor performance and sustained processor performance so that programs now tend to exploit only a fraction of available performance. While there is a tremendous ...

5 Removing communications in clustered microarchitectures through instruction replication

Alex Aletà, Josep M. Codina, Antonio González, David Kaeli

June 2004 **ACM Transactions on Architecture and Code Optimization** (TACO), Volume 1 Issue 2

Publisher: ACM

Full text available: pdf(618.21 KB) Additional Information: full citation, abstract, references, index terms

The need to communicate values between clusters can result in a significant performance loss for clustered microarchitectures. In this work, we describe an optimization technique that removes communications by selectively replicating an appropriate set ...

**Keywords**: Clustered microarchitectures, ILP, instruction replication, modulo-scheduling, statically scheduled processors

6 A simplified java bytecode compilation system for resource-constrained embedded processors

Carmen Badea, Alexandru Nicolau, Alexander V. Veidenbaum

September 2007 **CASES '07:** Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems

Publisher: ACM

Full text available: pdf(439.70 KB) Additional Information: full citation, abstract, references, index terms

Embedded platforms are resource-constrained systems in which performance and memory requirements of executed code are of critical importance. However, standard techniques such as full just-in-time (JIT) compilation and/or adaptive optimization (AO) may ...

**Keywords**: adaptive optimization, embedded systems, java virtual machine, profile-guided optimization, superoperators

7 The limits of instruction level parallelism in SPEC95 applications

Matthew A. Postiff, David A. Greene, Gary S. Tyson, Trevor N. Mudge

March 1999 ACM SIGARCH Computer Architecture News, Volume 27 Issue

Publisher: ACM

Additional Information: full citation, cited by, index terms

8 Low-power, low-complexity instruction issue using compiler assistance

Madhavi G. Valluri, Lizy K. John, Kathryn S. McKinley

June 2005 ICS '05: Proceedings of the 19th annual international conference on Supercomputing

Publisher: ACM

Full text available: pdf(350.14 KB) Additional Information: full citation, abstract, references, cited by

In an out-of-order issue processor, instructions are dynamically reordered and issued to function units in their data-ready order rather than their original program order to achieve high performance. The logic that facilitates dynamic issue is one of ...

9 The interaction of architecture and operating system design

Thomas E. Anderson, Henry M. Levy, Brian N. Bershad, Edward D. Lazowska

April 1991 **ASPLOS-IV:** Proceedings of the fourth international conference on Architectural support for programming languages and operating systems

Publisher: ACM

Full text available: pdf(1.60 MB) Additional Information: full citation, references, cited by, index terms

10 A self-organizing defect tolerant SIMD architecture

Jaidev Patwardhan, Chris Dwyer, Alvin R. Lebeck

July 2007 ACM Journal on Emerging Technologies in Computing Systems (JETC), Volume 3 Issue 2

Publisher: ACM

Full text available: pdf(1.25 MB) Additional Information: full citation, abstract, references, index terms

The continual decrease in transistor size (through either scaled CMOS or emerging nanotechnologies) promises to usher in an era of tera to petascale integration but with increasing defects. Regardless of fabrication methodology (top-down or bottom-up), ...

**Keywords**: DNA, SIMD, Self-organizing, bit-serial, data parallel, defect tolerance, nanocomputing

11 Synthesis of instruction sets for pipelined microprocessors

Ing-Jer Huang, Alvin M. Despain

June 1994 DAC '94: Proceedings of the 31st annual conference on Design automation

Publisher: ACM

Full text available: pdf(62.24 KB) Additional Information: full citation, references, cited by, index terms

12 A Run-Time Reconfigurable Datapath Architecture for Image Processing Applications

Marcos R. Boschetti, Ivan S. Silva, Sergio Bampi

February 2004 DATE '04: Proceedings of the conference on Design, automation and test in Europe - Volume 3, Volume 3

Publisher: IEEE Computer Society

Full text available: pdf(210.11 KB) Additional Information: full citation, abstract, index terms

This paper describes a run-time reconfigurable architecture targeted to flexible low-level image processing functions. The purpose is to present the evolution of the DRIP (Dynamically Reconfigurable Image Processor) architecture from a statically configurable ...

13 Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures

Phillip B. Gibbons, Christian Scheideler

June 2007 proceeding

Publisher: ACM

Additional Information: full citation, abstract

This volume consists of papers that were presented at the 19th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'07), held on June 9-11, 2007, in San Diego, CA, USA. The symposium was part of the 2007 Federated Computing Research ...

14 Enhanced code compression for embedded RISC processors

Keith D. Cooper, Nathaniel McIntosh

May 1999 PLDI '99: ACM SIGPLAN Notices, Volume 34 Issue 5

Publisher: ACM

Full text available: pdf(1.31 MB) Additional Information: full citation, abstract, references, cited by, index terms

This paper explores compiler techniques for reducing the memory needed to load and run program executables. In embedded systems, where economic incentives to reduce both RAM and ROM are strong, the size of compiled code is increasingly important. Similarly, ...

15 Multiscalar as a new architecture paradigm.

Jim Smith

December 1996 ACM Computing Surveys (CSUR), Volume 28 Issue 4es

Publisher: ACM

Full text available: html(17.07 KB) Additional Information: full citation

16 Instruction scheduling for a tiled dataflow architecture

Martha Mercaldi, Steven Swanson, Andrew Petersen, Andrew Putnam, Andrew Schwerin, Mark Oskin, Susan J. Eggers

October 2006 ASPLOS-XII: ACM SIGARCH Computer Architecture News, Volume 34 Issue 5

Publisher: ACM

Full text available: pdf(490.50 KB) Additional Information: full citation, abstract, references, cited by, index terms

This paper explores hierarchical instruction scheduling for a tiled processor. Our results show that at the top level of the hierarchy, a simple profile-driven algorithm effectively minimizes operand latency. After this schedule has been partitioned ...

**Keywords**: dataflow, instruction scheduling, tiled architectures

17 Power optimization in programmable processors and ASIC implementations of linear systems: transformation-based approach

Mani Srivastava, Miodrag Potkonjak

June 1996 **DAC '96:** Proceedings of the 33rd annual conference on Design automation

Publisher: ACM

Full text available: pdf(73.41 KB) Additional Information: full citation, references, cited by, index terms

18 Effective compiler generation by architecture description

Stefan Farfeleder, Andreas Krall, Edwin Steiner, Florian Brandner

July 2006 LCTES '06: ACM SIGPLAN Notices, Volume 41 Issue 7

Publisher: ACM

Full text available: pdf(128.18 KB) Additional Information: full citation, abstract, references, index terms

Embedded systems have an extremely short time to market and therefore require easily retargetable compilers. Architecture description languages (ADLs) provide a single concise architecture specification for the generation of hardware, instruction set ...

**Keywords**: architecture description language, code generation, compiler generation

19 A taxonomy of display processors

Ulrich Trambacz, Georg Hyla

January 1976 ISCA '76: ACM SIGARCH Computer Architecture News, Volume 4 Issue 4

Publisher: ACM

Full text available: pdf(110.19 KB) Additional Information: full citation, abstract, index terms

A potential customer examining computer graphics systems including a random positioning and refreshed CRT needs a lot of time and effort to form an opinion of the various marketed systems. To him not only systems appear very different, but also manuals ...

20 Control Flow Modeling in Statistical Simulation for Accurate and Efficient Processor Design Studies

Lieven Eeckhout, Robert H. Bell Jr., Bastiaan Stougie, Koen De Bosschere, Lizy K. John

June 2004 ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture

Publisher: IEEE Computer Society

Full text available: pdf(228.94 KB) Additional Information: full citation, abstract, references, cited by

Designing a new microprocessor is extremely time-consuming. One of the contributing reasons is that computer designers rely heavily on detailed architectural simulations, which are very time-consuming. Recent work has focused on statistical simulation to ...
